markup: add --citeproc to pandoc converter #9953
@shoeffner thanks for this. This PR is certainly easier to reason about (because of its size) than some of the others I've seen. That said, the test failure on Linux shows that this needs some rethinking, as that flag is not available in the version we test on, fetched via:
I assume this flag was added recently? I'm not sure how to handle this. |
The Linux build seems to use pandoc 2.5, but --citeproc is only supported from 2.11 onward (https://pandoc.org/releases.html#pandoc-2.11-2020-10-11). On macOS, 2.18 is installed, so I didn't notice. Before that, if I remember correctly, pandoc-citeproc had to be installed manually, but I will check it out in a container and see what I can do! |
choco and brew seem to install 2.18, so Windows and macOS should not need any specific handling. On Linux we have everything from 2.5 (Ubuntu 20.04, e.g. GH actions) to 2.17 (Arch Linux); the most common version seems to be 2.9 (Debian stable, Ubuntu 22.04, and other distros), here This makes the pull request much more complicated than what I was aiming for, so feel free to close it again. My current approach is now:
If the citations are not supported (due to the version or pandoc-citeproc dependency), SupportsCitations is false and the corresponding tests are skipped. I also added a remark to the docs. An alternative could be to check for the version and ignore pandoc-citeproc. This would make it much simpler and whoever wants to use citeproc might also want to use other pandoc features anyways. But it puts some burden on the users who have to install a newer pandoc version. What do you think? |
c71d9f8 to c000c6c
I rebased the commits and squashed them. |
I updated the PR again, I made a mistake ( |
Rebased on master. |
7dfa143 to 2f5412d
Sorry for the late reaction. I fixed it to use |
Rebased on master. |
Okay, comparing strings 2.5 and 2.11 was a stupid idea... Now it has a proper type and a function to compare two instances. 3c5e31f#diff-f662bf93836a7230b77cdb532095a04622eb9eda3054124e8ba9399786870efaR88-R94 Additionally, I included some test cases for those comparisons. I hope it will work now ;-) |
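To illustrate why comparing the strings 2.5 and 2.11 goes wrong and what a typed comparison could look like, here is a minimal Go sketch. The names `pandocVersion`, `parsePandocVersion`, and `atLeast` are hypothetical and chosen for this example; the type and function in the linked commit may look different.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// pandocVersion is a hypothetical type for illustration only; the PR's
// actual type and field names may differ.
type pandocVersion struct {
	major, minor int
}

// parsePandocVersion parses strings such as "2.5" or "2.11.4".
// Components after major.minor are ignored in this sketch.
func parsePandocVersion(s string) (pandocVersion, error) {
	parts := strings.Split(strings.TrimSpace(s), ".")
	if len(parts) < 2 {
		return pandocVersion{}, fmt.Errorf("unexpected version %q", s)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return pandocVersion{}, fmt.Errorf("bad major in %q: %w", s, err)
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return pandocVersion{}, fmt.Errorf("bad minor in %q: %w", s, err)
	}
	return pandocVersion{major: major, minor: minor}, nil
}

// atLeast reports whether v is the same as or newer than other.
func (v pandocVersion) atLeast(other pandocVersion) bool {
	if v.major != other.major {
		return v.major > other.major
	}
	return v.minor >= other.minor
}

func main() {
	required := pandocVersion{major: 2, minor: 11}
	for _, s := range []string{"2.5", "2.11", "2.18"} {
		v, err := parsePandocVersion(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		// A plain string comparison misjudges "2.5" because '5' sorts after '1'.
		fmt.Printf("%s: string compare=%v, numeric compare=%v\n",
			s, s >= "2.11", v.atLeast(required))
	}
}
```

Comparing the numeric components one by one avoids the lexicographic trap, at the cost of a small parser that has to decide what to do with malformed input.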
Unsupported pandoc should now cause the tests to SKIP instead of causing a panic. |
This PR has been automatically marked as stale because it has not had recent activity. The resources of the Hugo team are limited, and so we are asking for your help. |
Is this still relevant to Hugo? I have to admit that I don't even use it right now, so I haven't followed up anymore. |
I still haven't figured out a good way to make this work in a different way. For me, just having hugo pass So I would love for this PR to be rebased! |
The docs moved around quite a bit over the last 1.5 years, so I moved the section about citations to its own markdown file during the rebase. Please let me know if I should change that (or, for example, the weight) – or simply change it. I also found out that the test setup (or provider config) seems to have changed, I am currently working on updating that. |
I found the issue in the test setup (the ProviderConfig was not passed correctly, and I seem to have written that bug the last time I rebased in August '23, whoops). There have also been some output format changes in Pandoc, so I will relax the tests a bit and make sure they only check for "contains" on the references, not the exact markup (as that will be different between Pandoc versions). |
Done, the tests no longer rely on a specific output format and only check whether the author last names, the year, and a word from the title are included in the documents (or not included if there were no citations). |
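A relaxed "contains" check of this kind could be sketched as follows. The helpers `containsAll` and `containsNone` and the sample HTML are hypothetical and only illustrate the idea; the PR's tests may be structured differently.

```go
package main

import (
	"fmt"
	"strings"
)

// containsAll reports whether every fragment occurs somewhere in output.
func containsAll(output string, fragments ...string) bool {
	for _, f := range fragments {
		if !strings.Contains(output, f) {
			return false
		}
	}
	return true
}

// containsNone reports whether none of the fragments occur in output.
func containsNone(output string, fragments ...string) bool {
	for _, f := range fragments {
		if strings.Contains(output, f) {
			return false
		}
	}
	return true
}

func main() {
	// Hypothetical rendered output; the real markup differs between
	// pandoc versions, which is why only fragments are checked.
	withCitation := `<p>See <span class="citation">(Doe 2020)</span>.</p>
<div id="refs"><div>Doe, Jane. 2020. <em>An Example Title</em>.</div></div>`
	withoutCitation := `<p>No citations here.</p>`

	fmt.Println(containsAll(withCitation, "Doe", "2020", "Example"))
	fmt.Println(containsNone(withoutCitation, "Doe", "2020", "Example"))
}
```

Checking fragments instead of exact markup trades some precision for stability across pandoc releases.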
I received an email that the tests have failed, but I think that's what I already fixed in the latest commit. |
Just tested this rebase and it works fine; I built with My current workflow with regular extended Hugo:
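touch .dummy
# Convert each draft to GFM with citations resolved (-C is short for --citeproc).
for f in "${DRAFT_DIR}"/*.md; do
  # Keep the file name, but write the result into the post directory.
  "pandoc-${PANDOC_VERSION}/bin/pandoc" \
    -C -t gfm --resource-path="${DRAFT_DIR}" -H .dummy \
    "$f" -o "${POST_DIR}/$(basename "$f")"
done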
Workflow with this PR:
Much easier and no preprocessing needed! |
@shoeffner I'm not in a position to judge the value of this addition, but I do have a question. Instead of all of the version checking code, can you just test for success?
pandoc --citeproc --dump-args
echo $? # 0 (success)
pandoc --foo --dump-args
echo $? # 6 (failure) |
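In Go, the exit-code probe suggested above might be sketched like this. `supportsFlag` is a hypothetical helper, not the PR's code, and it assumes that the installed pandoc still accepts `--dump-args`; on a pandoc build without that option the probe would report false even if `--citeproc` works.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// supportsFlag runs `binary flag --dump-args` and reports whether the
// process exited with status 0. It is a hypothetical helper for
// illustration; the probe command is taken from the comment above.
func supportsFlag(binary, flag string) (bool, error) {
	err := exec.Command(binary, flag, "--dump-args").Run()
	if err == nil {
		return true, nil
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		// The binary ran but exited with a non-zero status,
		// for example because it rejected the flag.
		return false, nil
	}
	// The binary could not be started at all (for example, not installed).
	return false, err
}

func main() {
	ok, err := supportsFlag("pandoc", "--citeproc")
	if err != nil {
		fmt.Println("pandoc could not be run:", err)
		return
	}
	fmt.Println("citeproc supported:", ok)
}
```

Separating "exited non-zero" from "could not be started" lets a caller skip citation tests when pandoc is missing and only disable `--citeproc` when pandoc is present but rejects the probe.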
can you just test the exit code
Sounds plausible, I'll give it a try!
Add `markup: pandoc` to front matter
Hm, should I add that to the documentation about citations I added? I think
the test examples work without that.
Thanks for trying it out!
|
To clarify, this is because I did not set pandoc as the global markup engine, only for select content. |
Thanks, that makes sense! |
Why not change the file extension to |
Because I didn't know I could do that. Thanks for the tip - I'm not a hugo expert by any means! |
It is not yet available on hugo 0.145.0. |
e68d976 to 038007c
I changed the citeproc detection to use |
Hm, it seems the tests were unable to pull some data from X.com - I don't think that's related to my PR, though. Or am I mistaken? |
The X.com API has been unstable lately. |
You're logging and returning errors, resulting in repetitive console output:
|
Adds the citeproc filter to the pandoc converter. There are several PRs for this feature already. However, I think simply adding `--citeproc` is the cleanest way to enable this feature, with the option to flesh it out later, e.g., in gohugoio#7529.

Some PRs and issues attempt adding more config options to Hugo which indirectly configure pandoc, but I think simply configuring Pandoc via Pandoc itself is simpler, as it is already possible with two YAML blocks -- one for Hugo, and one for Pandoc:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    ...
    Document content with @citation!

There are other useful options, e.g., gohugoio#4800 attempts to use `nocite`, which works out of the box with this PR:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    nocite: |
      @*
    ...
    Document content with no citations but a full bibliography:

    ## Bibliography

Other useful options are `csl: ...` and `link-citations: true`, which set the path to a custom CSL file and create HTML links between the references and the bibliography.

The following issues and PRs are related:

- Add support for parsing citations and Jupyter notebooks via Pandoc and/or Goldmark extension gohugoio#6101
  Bundles multiple requests, this PR tackles citation parsing.
- WIP: Bibliography with Pandoc gohugoio#4800
  Passes the frontmatter to Pandoc and still uses `--filter pandoc-citeproc` instead of `--citeproc`.
- Allow configuring Pandoc gohugoio#7529
  That PR is much more extensive and might eventually supersede this PR, but I think --bibliography and --citeproc should be independent options (--bibliography should be optional and citeproc can always be specified).
- Pandoc - allow citeproc extension to be invoked, with bibliography. gohugoio#8610
  Similar to gohugoio#7529, gohugoio#8610 adds a new config option to Hugo.

I think passing --citeproc and letting the users decide on the metadata they want to pass to pandoc is better, albeit uglier.
I removed the two log lines, is it better this way? Otherwise, can you share the example you are trying so that I can reproduce the behavior? |
When you ask a question such as, "Is it better this way?" it sounds like you are guessing, which doesn't inspire much confidence. The example I ran is indicated by the error message: I removed pandoc from the allow list. To answer your question, yes, it is better this way. |
@jmooring, I only ever run the tests with I don't know what example project you use to test the PR, even though it was clear to me that you removed pandoc from the allow list. Of course, I understand that I am not the ideal contributor to work on this: I haven't worked with Hugo since shortly after I originally drafted this PR in 2022, and I am not even using this feature myself anymore. I am trying my best to get a feel for the project and improve as I get suggestions, be it from you, from @bep (who reviewed the first draft and helped me over a few Go hurdles), or from others in this thread. If you feel I am not confident enough to contribute this, I will of course step back from the PR; you can close it, and/or someone else can pick it up. |
Adds the citeproc filter to the pandoc converter.
There are several PRs for this feature already. However, I think simply adding `--citeproc` is the cleanest way to enable this feature, with the option to flesh it out later, e.g., in #7529.
Some PRs and issues attempt adding more config options to Hugo which indirectly configure pandoc, but I think simply configuring Pandoc via Pandoc itself is simpler, as it is already possible with two YAML blocks -- one for Hugo, and one for Pandoc:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    ...
    Document content with @citation!

There are other useful options, e.g., #4800 attempts to use `nocite`, which works out of the box with this PR:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    nocite: |
      @*
    ...
    Document content with no citations but a full bibliography:

    ## Bibliography

Other useful options are `csl: ...` and `link-citations: true`, which set the path to a custom CSL file and create HTML links between the references and the bibliography.
The following issues and PRs are related:
- Add support for parsing citations and Jupyter notebooks via Pandoc and/or Goldmark extension #6101
  Bundles multiple requests, this PR tackles citation parsing.
- WIP: Bibliography with Pandoc #4800
  Passes the frontmatter to Pandoc and still uses `--filter pandoc-citeproc` instead of `--citeproc`.
- Allow configuring Pandoc #7529
  That PR is much more extensive and might eventually supersede this PR, but I think --bibliography and --citeproc should be independent options (--bibliography should be optional and citeproc can always be specified).
- Pandoc - allow citeproc extension to be invoked, with bibliography. #8610
  Similar to #7529, #8610 adds a new config option to Hugo.
I think passing --citeproc and letting the users decide on the metadata they want to pass to pandoc is better, albeit uglier.
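As a rough illustration of the approach described here, the converter only needs to append one flag when citation support is detected, and everything else (bibliography, CSL, nocite) stays in the document's Pandoc YAML block. The sketch below is not the PR's code: `pandocArgs` is a hypothetical helper, and the `--mathjax` baseline is an assumption about how the converter invokes pandoc.

```go
package main

import "fmt"

// pandocArgs returns the command-line arguments for a pandoc call.
// Hypothetical helper for illustration; the "--mathjax" baseline is an
// assumption about the existing converter, not taken from this thread.
func pandocArgs(supportsCiteproc bool) []string {
	args := []string{"--mathjax"}
	if supportsCiteproc {
		// No --bibliography here: the document's own Pandoc YAML block
		// is expected to carry bibliography, csl, nocite, and so on.
		args = append(args, "--citeproc")
	}
	return args
}

func main() {
	fmt.Println(pandocArgs(true))
	fmt.Println(pandocArgs(false))
}
```

Keeping the flag unconditional on any Hugo-side bibliography setting matches the stated preference that --bibliography and --citeproc remain independent.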
Note:
This PR also adds a tiny little bit of unrelated documentation to the external helpers by pointing out MathJax superficially. This certainly needs improvements, but is out of the scope for this PR.